Explanations
names or terms related to politics and individuals
names of people or notable individuals
Negative Logits
ModLoader    -0.83
theless      -0.61
姫           -0.59
Galileo      -0.58
──           -0.57
ミ           -0.57
etheless     -0.56
ashtra       -0.56
Fancy        -0.56
Eleanor      -0.55
Positive Logits
antz         0.71
burn         0.71
linger       0.71
zen          0.70
lett         0.69
beck         0.68
zer          0.67
berg         0.66
erman        0.65
mann         0.65
Activations Density 0.281%
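As a rough illustration of how a density figure like the one above could be computed, here is a minimal sketch. It assumes that "density" means the fraction of sampled token positions on which the feature's activation is above zero; the source does not state the exact definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (active positions) / (all sampled positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 281 active positions out of 100,000 sampled tokens.
sample = [1.5] * 281 + [0.0] * 99_719
density = activation_density(sample)
print(f"{density:.3%}")
```

A low density of this kind would indicate a sparse feature, one that fires on only a small share of the tokens it is evaluated on.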