INDEX
Explanations
phrases indicating a decision or evaluation
occurrences of the verb "to be" in various forms
Negative Logits
asers       -0.67
andise      -0.66
iser        -0.63
tor         -0.62
Comments    -0.62
itcher      -0.60
clip        -0.60
meanwhile   -0.60
Cheong      -0.60
atre        -0.59
Positive Logits
happening   1.03
raining     0.99
nt          0.94
impossible  0.93
done        0.87
okay        0.86
meant       0.85
gonna       0.81
undet       0.80
feasible    0.79
Activations Density: 0.341%