INDEX
Explanations
the word "then" followed by a numerical value
Negative Logits
toe          -0.74
md           -0.61
illon        -0.61
Parables     -0.60
Shore        -0.59
acts         -0.59
des          -0.57
Peninsula    -0.57
triangles    -0.56
ting         -0.54
Positive Logits
-'           0.90
-,           0.77
Yugoslav     0.74
-            0.73
proceeded    0.72
Reviewer     0.71
arily        0.67
-.           0.66
ose          0.65
ally         0.65
Activations Density: 0.039%