INDEX
Explanations
instances of the word "that" followed by a certain context
instances of the word "that" in various contexts
Negative Logits
Token      Logit
inx        -0.75
ulner      -0.74
quished    -0.60
inders     -0.60
eners      -0.60
ricanes    -0.57
icons      -0.57
listed     -0.57
Pwr        -0.56
yip        -0.56
Positive Logits
Token           Logit
regard          1.66
vein            1.39
case            1.29
circumstance    1.28
respect         1.27
manner          1.23
context         1.22
respects        1.21
regards         1.20
instance        1.19
Activations Density: 0.082%