INDEX
Explanations
the word "UR."
repeated instances of the term "UR."
Negative Logits
Schultz      -0.74
Eb           -0.71
BaseType     -0.70
Schn         -0.69
etts         -0.67
Sands        -0.64
olean        -0.64
Kissinger    -0.63
Ghana        -0.61
hed          -0.61
Positive Logits
UR            1.14
POSE          1.05
confir        1.04
GER           0.99
BLE           0.97
ARCH          0.92
pees          0.91
RR            0.90
DAY           0.88
pee           0.88
Activations Density 0.009%