Explanations
sentences that transition between different topics or ideas
phrases that contain the word "there."
Negative Logits
Token      Logit
CJ         -0.81
±          -0.66
pound      -0.63
correct    -0.60
ONSORED    -0.60
franc      -0.59
IPM        -0.58
+/-        -0.58
cogn       -0.57
beans      -0.57
Positive Logits
Token          Logit
abouts         1.41
upon           1.09
fore           0.86
hovah          0.80
ngth           0.80
guiActiveUn    0.78
olkien         0.76
choes          0.74
Ukrain         0.72
iltr           0.72
Activations Density: 0.147%