INDEX
Explanations
phrases asking about the meaning or implications of a given situation or concept
references to the terms "this" and "that" in relation to questioning meaning or significance
Negative Logits
mens       -1.00
mop        -0.91
culosis    -0.81
geons      -0.81
iencies    -0.81
ahime      -0.80
vic        -0.80
dos        -0.79
vard       -0.78
anas       -0.78
Positive Logits
entails    1.03
mean       0.96
meant      0.92
entail     0.90
means      0.88
Means      0.82
implies    0.82
imply      0.78
newfound   0.78
boils      0.77
Activations Density: 0.064%