INDEX
Explanations
references to the primary topic or main theme in a given context
references to a central theme or main topic in various contexts
Negative Logits
ania      -0.81
asca      -0.78
idden     -0.71
OUGH      -0.68
named     -0.68
ston      -0.66
apons     -0.64
ibrary    -0.63
mia       -0.62
ccess     -0.61
Positive Logits
starter   0.87
rite      0.84
squarely  0.80
fulness   0.78
Goal      0.76
focus     0.76
focuses   0.76
ivation   0.75
focused   0.75
tenance   0.72
Activations Density: 0.034%