INDEX
Explanations
phrases related to events or actions that are happening for the first time
instances of the word "first" indicating new occurrences or milestones
Negative Logits
lang      -0.73
gery      -0.73
maps      -0.71
Ïī        -0.70
tics      -0.69
errors    -0.69
morph     -0.67
lov       -0.65
ACTION    -0.65
lah       -0.63
Positive Logits
responders     1.07
baseman        1.00
foray          0.95
glimpse        0.86
installment    0.85
ever           0.83
attempt        0.78
batch          0.78
lady           0.78
incarnation    0.75
Activations Density: 0.088%