INDEX
Explanations
phrases indicating additional information or supplementary details
the word "Additionally" and its variations, indicating the presence of supplementary information
Negative Logits
Token     Logit
76561     -0.80
hearts    -0.68
venge     -0.68
enders    -0.65
Fare      -0.64
fell      -0.63
ger       -0.62
liest     -0.61
ender     -0.60
orno      -0.59
Positive Logits
Token         Logit
guiActiveUn   0.78
importantly   0.72
æ©Ł           0.72
noteworthy    0.69
ilitary       0.66
ally          0.65
notation      0.65
ificantly     0.64
yss           0.64
orthy         0.62
Activations Density: 0.015%
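
To give a sense of scale, the density figure can be converted into an expected number of activating tokens. The sketch below assumes that "Activations Density" is the fraction of token positions on which the feature fires at all. That reading, and the helper name `expected_active_tokens`, are assumptions and are not stated on this page.

```python
# Density as reported on this page, in percent.
DENSITY_PERCENT = 0.015


def expected_active_tokens(num_tokens: int,
                           density_percent: float = DENSITY_PERCENT) -> int:
    """Expected count of tokens on which the feature is active.

    Assumes density is the share of token positions with a nonzero
    activation (an assumption, not confirmed by the source).
    """
    return round(num_tokens * density_percent / 100)


if __name__ == "__main__":
    # Expected number of activating tokens in a one-million-token sample.
    print(expected_active_tokens(1_000_000))
```

Under that assumption the feature is sparse: it would fire on only a small handful of tokens in any ordinary-length document.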