INDEX
Explanations
lists of items, such as countries, representatives, or movies
phrases indicating lists
Negative Logits
ester      -0.72
entimes    -0.72
imet       -0.71
gypt       -0.69
iva        -0.67
imeter     -0.65
breeze     -0.64
athy       -0.64
lycer      -0.63
ashtra     -0.62
Positive Logits
sorts              0.89
entries            0.85
accomplishments    0.79
ーン               0.76
items              0.75
Topics             0.74
ingredients        0.72
celebrities        0.71
keywords           0.71
grievances         0.70
Activations Density: 0.107%