Explanations
sentences asserting a critical perspective on societal issues or individuals
Negative Logits
Token      Logit
ONSORED    -0.61
Heights    -0.60
favour     -0.57
ointed     -0.56
HI         -0.54
Erit       -0.54
elta       -0.53
Tamil      -0.53
Honour     -0.52
Magicka    -0.52
Positive Logits
Token      Logit
abouts     1.45
upon       1.15
fore       0.89
after      0.78
FORE       0.75
etheless   0.74
ain        0.73
with       0.72
'll        0.71
isn        0.70
Activations Density: 0.129%