INDEX
Explanations
phrases asking for opinions or thoughts related to a specific topic
phrases asking for opinions or thoughts on various subjects
Negative Logits
leases                            -0.71
Entered                           -0.68
ngth                              -0.67
ccording                          -0.67
////////////////////////////////  -0.66
LEASE                             -0.65
////////////////                  -0.65
llan                              -0.64
onna                              -0.62
ceased                            -0.62
Positive Logits
aisle        0.67
Brexit       0.66
RTX          0.65
respecting   0.63
Chomsky      0.61
feminism     0.60
erous        0.60
Mondays      0.60
whether      0.59
temperament  0.59
Activations Density 0.104%