INDEX
Explanations
words related to agreements or statements
occurrences of the word "statement" in various contexts
Negative Logits
cannibal       -0.67
turkey         -0.66
sung           -0.65
âĺ             -0.65
drink          -0.65
experienced    -0.64
âĺ             -0.63
Ïī             -0.62
rotation       -0.60
âĻ             -0.60
Positive Logits
ements         4.98
ement          2.25
EMENT          1.79
ments          1.45
ancies         1.18
ures           1.12
aments         1.09
etimes         1.07
itures         1.06
urers          1.04
Activations Density 0.020%
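
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is obtained: the fraction of token positions at which the feature's activation is above zero. This definition is an assumption, since the page does not state how its density is calculated. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature fires. The source page does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 2 of which activate the feature.
acts = [0.0] * 9998 + [3.1, 0.7]
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.020%
```

Under that reading, a density of 0.020% corresponds to roughly 2 active positions per 10,000 tokens, which would make this a sparse feature.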