INDEX
NEGATIVE LOGITS
RUnlock      -0.65
houſe        -0.64
himſelf      -0.62
itſelf       -0.62
themſelves   -0.61
ſtate        -0.61
myſelf       -0.61
preſent      -0.58
Eſ           -0.57
anskje       -0.57
POSITIVE LOGITS
ity      1.02
ally     1.02
ary      0.94
ly       0.86
ality    0.82
als      0.81
ening    0.79
ial      0.78
ing      0.77
al       0.75
Activations Density 0.062%