Explanations
phrases expressing certainty or strong opinions
verbs indicating states of being or existence
Negative Logits
Token        Logit
isode        -0.87
ETA          -0.77
onding       -0.76
ertodd       -0.74
)))          -0.72
))))         -0.68
wich         -0.68
imei         -0.68
rant         -0.67
ples         -0.67
Positive Logits
Token        Logit
raining      1.16
impossible   1.08
easier       1.05
possible     0.98
difficult    0.94
customary    0.92
advisable    0.90
incumbent    0.87
conceivable  0.85
easy         0.84
Activations Density: 0.192%
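
The page does not say how the density figure is computed. A common reading, assumed here, is the fraction of sampled token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all; the source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 19 of which activate the feature.
sample = [0.0] * 9981 + [1.5] * 19
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.190%
```

Under this reading, a density of 0.192% would mean the feature fires on roughly 2 of every 1,000 sampled tokens, which is consistent with a fairly sparse feature.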