Explanations
proper nouns or names related to a particular individual
mentions of the name "Strong" in relation to violent events
Negative Logits
externalToEVAOnly     -0.84
apter                 -0.76
è£ıç                  -0.72
å¹                    -0.68
opus                  -0.67
ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³    -0.66
illon                 -0.66
utory                 -0.65
mits                  -0.64
uracy                 -0.63
Positive Logits
itudinal              1.12
er                    1.01
est                   0.90
ly                    0.88
enger                 0.87
erness                0.85
ingham                0.85
ings                  0.81
edo                   0.81
fast                  0.80
Activations Density: 0.024%