Explanations
mentions of a specific person's name, "Koh"
proper nouns, specifically names
Negative Logits
xual             -0.89
*/(              -0.73
phia             -0.71
PsyNetMessage    -0.70
balloons         -0.69
opausal          -0.69
conserv          -0.66
éĹĺ              -0.66
Canadiens        -0.66
ngth             -0.64
Positive Logits
lyak             0.98
istani           0.88
aku              0.84
len              0.84
sten             0.82
ota              0.82
kus              0.81
Koh              0.81
eling            0.78
adish            0.78
Activations Density: 0.017%