INDEX
Explanations
the word "favor" or "favour" in different contexts
phrases indicating support or approval
Negative Logits
ı                -0.69
Bam              -0.64
Wallet           -0.61
ridges           -0.59
laun             -0.58
borg             -0.57
Thumbnails       -0.57
DragonMagazine   -0.57
INFO             -0.57
ember            -0.56
Positive Logits
itism            1.84
ited             1.06
ably             1.00
ites             0.96
ability          0.89
itive            0.84
favoring         0.83
itely            0.82
ite              0.77
iting            0.75
Activations Density 0.057%