INDEX
Explanations
words related to fruits, specifically apples
references to apples
Negative Logits
ategory      -0.88
iltr         -0.75
uled         -0.72
enance       -0.70
USS          -0.67
Simulation   -0.67
SPONSORED    -0.67
ãĤ¸          -0.66
oller        -0.65
ocumented    -0.64
Positive Logits
cider        1.36
apple        1.15
apple        1.11
apples       1.05
juice        1.01
fruit        1.00
baum         0.98
cone         0.95
oint         0.85
pie          0.84
Activations Density 0.023%