
October 14, 2023

Here are some corrections regarding "Detecting Fake News — with a BERT Model"

I managed to get through this tutorial: https://medium.com/@skillcate/detecting-fake-news-with-a-bert-model-9c666e3cdd9b

Here are corrections I made to "make it work" (to quote Tim Gunn).

1. In the first chunk, I made the following change: 

This works:
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
This does not work:
from sklearn.metrics import plot_confusion_matrix  # as shown in the post; it has since been deprecated and removed from scikit-learn (I am writing this in October 2023)
This BERT-for-dummies post is only about a year old, and yet scikit-learn does not like plot_confusion_matrix anymore...
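For anyone hitting the same error, the replacement API works roughly like this (a minimal sketch; y_test and predictions stand in for whatever the post's prediction chunk actually produces, so the names are my assumption):

import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

cm = confusion_matrix(y_test, predictions)          # true labels vs. model predictions
ConfusionMatrixDisplay(confusion_matrix=cm).plot()  # replaces the removed plot_confusion_matrix call
plt.show()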
2. Before I ran the first chunk, I created a folder in my Google Drive (I am using Google Colab) so that the following cd works:
%cd /content/drive/MyDrive/1_LiveProjects/Project11_FakeNewsDetection
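In case it helps, this is roughly what I ran first to mount Drive and create that folder (a sketch; the folder path is simply the one I chose to mirror the post):

from google.colab import drive
import os

drive.mount('/content/drive')  # authorize Colab to access Google Drive
project_dir = '/content/drive/MyDrive/1_LiveProjects/Project11_FakeNewsDetection'
os.makedirs(project_dir, exist_ok=True)  # create the folder if it does not exist yet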
3. In the second chunk, the first two lines that load the data did not work on the first try.
Of course, because I did not have these csv files, duh!
First, you have to download the files from Kaggle. Also, "a1_True.csv" and "a2_Fake.csv" do not work unless you have renamed the Kaggle files. Instead, you need:
true_data = pd.read_csv('True.csv')
fake_data = pd.read_csv('Fake.csv')
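For reference, the two files end up combined with a target column, roughly like this (a sketch; the label column name here is my own choice, so match it to whatever the post's later chunks expect):

import pandas as pd

true_data = pd.read_csv('True.csv')
fake_data = pd.read_csv('Fake.csv')

true_data['label'] = 1   # mark real news
fake_data['label'] = 0   # mark fake news
data = pd.concat([true_data, fake_data], ignore_index=True)  # one frame for training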
4. The fine-tuning segment that begins with "# Train and predict" takes time. In my case, it took over an hour.
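That hour was with a GPU runtime; it is worth confirming Colab actually gave you one before starting (a quick check, assuming the PyTorch setup the post's .pt weight files imply):

import torch

print(torch.cuda.is_available())  # should print True on a GPU runtime
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')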
5. The segment that builds the classification report, which begins with "# load weights of best model", gives an error.
Specifically, this line does not work:
path = 'c1_fakenews_weights.pt'
because in the preceding segment the weights were saved under a different name, "c2_new_model_weights.pt".
So I have to write instead:
path = 'c2_new_model_weights.pt'
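With that path, the loading step in the same chunk then works; roughly, it looks like this (a sketch; 'model' is the fine-tuned BERT architecture object defined earlier in the post, not something defined here):

import torch

path = 'c2_new_model_weights.pt'  # must match the filename used when saving in the training segment
state_dict = torch.load(path, map_location=torch.device('cpu'))  # map_location avoids a GPU/CPU mismatch error
# model.load_state_dict(state_dict)  # 'model' comes from the earlier architecture chunk in the post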

