Explanations
verbs related to action or influence
phrases indicating the role or importance of various factors or elements in different contexts
Negative Logits
Token        Logit
luaj         -0.67
IB           -0.65
ertation     -0.61
Shib         -0.61
ighting      -0.59
ortium       -0.57
ilings       -0.57
rians        -0.56
loss         -0.56
bey          -0.55
Positive Logits
Token        Logit
havoc        1.29
roles        0.99
pivotal      0.91
important    0.81
catch        0.79
crucial      0.78
into         0.78
role         0.77
prominently  0.77
key          0.77
Activation Density: 0.035%