INDEX
NEGATIVE LOGITS
ĪĴ          -0.68
difficulty  -0.63
Fairfax     -0.63
Done        -0.62
Pradesh     -0.61
tert        -0.60
surplus     -0.58
sled        -0.58
atche       -0.57
Taste       -0.57
POSITIVE LOGITS
heit        1.01
andise      0.97
iland       0.92
inki        0.87
arters      0.85
oad         0.84
encer       0.84
ê           0.83
ences       0.82
ounters     0.81
Activations Density 0.159%