INDEX
Explanations
names or terms related to honorary titles or positions
references to emergencies and urgent situations
Negative Logits
Ogre                -0.74
cooker              -0.68
Strauss             -0.66
bill                -0.63
Pilgrim             -0.63
burgers             -0.62
guiActiveUnfocused  -0.60
underest            -0.60
Yoga                -0.59
wana                -0.59
Positive Logits
gency               1.08
itus                1.01
gent                1.01
andum               0.99
oing                0.92
acy                 0.92
emer                0.91
oint                0.89
ira                 0.88
itable              0.87
Activations Density: 0.014%