INDEX
Explanations
the phrase "near future" and related terms
phrases indicating proximity in time or space
Negative Logits
lain       -0.69
arians     -0.67
Ĥª         -0.67
Clubs      -0.66
gae        -0.65
hops       -0.64
ilus       -0.64
Females    -0.64
istry      -0.64
Tips       -0.63
Positive Logits
sighted     1.21
shore       0.93
entimes     0.87
misses      0.85
mint        0.78
ctic        0.76
zero        0.74
unanimous   0.71
utherford   0.70
-           0.69
Activations Density: 0.032%