Explanations
names of individuals
proper nouns, particularly names and initials
Negative Logits
å§«          -0.97
ãĥ¼ãĤ¯      -0.77
urat         -0.76
Catalonia    -0.70
ModLoader    -0.69
âĸº          -0.68
âĶĢâĶĢ      -0.68
EVA          -0.67
Pradesh      -0.66
Lanka        -0.66
Positive Logits
aylor        0.79
iggs         0.76
isner        0.75
antz         0.73
verson       0.71
elson        0.69
aney         0.68
arkin        0.68
ahl          0.68
horn         0.67
Activations Density: 0.222%