INDEX
Explanations
references to specific names
references to specific individuals, particularly names associated with notable public figures or characters
Negative Logits
ioch            -0.68
rax             -0.67
fielded         -0.66
unct            -0.65
underest        -0.62
ed              -0.62
sterdam         -0.62
roman           -0.61
underestimated  -0.61
rifice          -0.60
Positive Logits
Dunham          1.19
Gw              1.19
Lena            1.10
Scotia          0.86
estone          0.83
istor           0.77
Chu             0.75
Tina            0.74
Juice           0.73
Turner          0.72
Activations Density: 0.006%