INDEX
Explanations
instances of the word "Rock" in various contexts
Negative Logits
URES          -0.78
unctions      -0.75
urers         -0.73
aver          -0.71
BILITIES      -0.67
verages       -0.66
URE           -0.66
sidel         -0.65
terday        -0.65
practicable   -0.65
Positive Logits
ford          1.04
castle        1.03
ledge         1.01
star          1.01
cliffe        1.00
estone        0.95
ingham        0.93
stars         0.93
ete           0.92
well          0.92
Activations Density 0.006%