INDEX
Explanations
names of political figures
capital letters that likely represent names
Negative Logits
Coral       -0.67
Cecil       -0.63
Cricket     -0.60
Pixar       -0.59
Devon       -0.59
Telegraph   -0.59
pleasure    -0.59
dispatch    -0.58
Windsor     -0.58
Mercury     -0.57
Positive Logits
akh     1.10
awar    0.94
abis    0.91
ulk     0.91
oub     0.86
issan   0.85
inav    0.84
uty     0.84
ı       0.83
ikh     0.82
Activations Density 0.104%