INDEX
Explanations
instances of the word "first"
occurrences of the word "first."
Negative Logits
tics    -0.74
still   -0.71
<?      -0.71
holes   -0.67
bara    -0.67
Mub     -0.67
mins    -0.65
lass    -0.65
aucus   -0.63
llers   -0.62
Positive Logits
foray        1.44
outing       1.08
appearance   1.00
stint        0.97
attempt      0.96
encounter    0.93
trip         0.90
impressions  0.90
visit        0.90
birthday     0.89
Activations Density 0.107%