INDEX
Explanations
political figures' names
names of prominent political figures
NEGATIVE LOGITS
Mehran        -0.61
semble        -0.59
posted        -0.58
streamed      -0.58
ASA           -0.57
Veter         -0.57
Honolulu      -0.56
Indianapolis  -0.56
Norn          -0.55
Captain       -0.55
POSITIVE LOGITS
omics         1.30
mania         1.17
ism           1.10
ian           1.06
esque         1.05
care          1.02
isms          0.95
Care          0.94
itism         0.94
istas         0.86
Activations Density 0.232%