INDEX
Explanations
names of people
names of individuals and specific identifiers associated with people
Negative Logits
Banking        -0.69
fulfillment    -0.68
circuits       -0.68
distance       -0.66
spl            -0.64
fallacy        -0.63
metaphors      -0.63
appointment    -0.63
INESS          -0.62
disguise       -0.62
Positive Logits
cia            1.13
imir           1.11
yna            1.10
chuk           1.07
ko             1.05
chev           1.00
chel           0.98
iq             0.97
nik            0.97
nikov          0.97
Activations Density 0.233%