Explanations
instances of words related to public statements or actions
instances of the word "publicly."
Negative Logits
nesota      -0.90
nian        -0.76
tein        -0.72
illas       -0.71
ĸļ          -0.71
ailand      -0.70
Salvador    -0.70
anwhile     -0.69
Stick       -0.69
akeru       -0.69
Positive Logits
humiliated            0.80
traded                0.80
ached                 0.78
speaking              0.76
ised                  0.73
isSpecialOrderable    0.73
licted                0.72
addressed             0.71
dispersed             0.71
pronounce             0.70
Activations Density: 0.010%