INDEX
Explanations
instances of the word "tank."
references to tanks
Negative Logits
âĸ¬            -0.92
BOOK           -0.72
————           -0.68
teness         -0.67
phony          -0.67
ãĥ¬            -0.66
Brotherhood    -0.65
theless        -0.65
Lod            -0.65
çīĪ            -0.65
Positive Logits
erness         1.22
ard            1.00
mates          0.99
ards           0.99
ered           0.94
yard           0.94
agers          0.93
tank           0.91
ering          0.90
tops           0.85
Activations Density 0.040%