Asked  7 Months ago    Answers:  5   Viewed   37 times

I was trying to understand how address computation instructions work, especially the leaq instruction. I got confused when I saw examples that use leaq to do arithmetic. For example, the following C code,

long m12(long x) {
    return x*12;
}

In assembly,

leaq (%rdi, %rdi, 2), %rax
salq $2, %rax

If my understanding is right, leaq should move whatever address (%rdi, %rdi, 2) evaluates to, which should be 2*%rdi+%rdi, into %rax. What confuses me is this: since the value x is stored in %rdi, which is just a memory address, why does multiplying %rdi by 3 and then left-shifting this memory address by 2 give x times 12? When we multiply %rdi by 3, don't we jump to another memory address that does not hold the value x?



leaq doesn't have to operate on memory addresses. It computes an address but doesn't actually read from the result, so until a mov or the like tries to use it, it's just an esoteric way to add one number to 1, 2, 4 or 8 times another number (or the same number, in this case). It's frequently "abused" for mathematical purposes, as you see. 2*%rdi+%rdi is just 3 * %rdi, so it's computing x * 3 without involving the multiplier unit on the CPU.

Similarly, left shifting, for integers, doubles the value for every bit shifted (every zero added to the right), thanks to the way binary numbers work (the same way in decimal numbers, adding zeroes on the right multiplies by 10).

So this is abusing the leaq instruction to accomplish multiplication by 3, then shifting the result to achieve a further multiplication by 4, for a final result of multiplying by 12 without ever actually using a multiply instruction (which it presumably believes would run more slowly, and for all I know it could be right; second-guessing the compiler is usually a losing game).

To be clear, it's not abuse in the sense of misuse, just using it in a way that doesn't clearly align with the implied purpose you'd expect from its name. It's 100% okay to use it this way.
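As a rough illustration of the arithmetic, here is a minimal Python sketch that models the two instructions on 64-bit values. The helper names (lea, sal, m12, to_signed) are made up for this example; it only mimics the register arithmetic and says nothing about how a CPU or compiler is actually implemented.

```python
MASK64 = (1 << 64) - 1  # model 64-bit registers by wrapping results


def lea(base, index, scale, disp=0):
    """Compute disp + base + index*scale without touching memory."""
    assert scale in (1, 2, 4, 8)  # the only scale factors the encoding allows
    return (disp + base + index * scale) & MASK64


def sal(value, count):
    """Shift left by count bits; each bit doubles the value."""
    return (value << count) & MASK64


def to_signed(value):
    """Reinterpret a 64-bit pattern as a signed (two's complement) number."""
    return value - (1 << 64) if value >> 63 else value


def m12(x):
    rdi = x & MASK64            # the argument x itself lives in %rdi
    rax = lea(rdi, rdi, 2)      # leaq (%rdi,%rdi,2), %rax  -> 3*x
    rax = sal(rax, 2)           # salq $2, %rax             -> 4*(3*x)
    return to_signed(rax)


print(m12(5), m12(-3))
```

The point is that %rdi holds the number x, not a pointer to it, so the "address" being computed is just the value 3*x.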

Tuesday, June 1, 2021
answered 7 Months ago

The overflow flag is set when a signed operation produces a result with the wrong sign, that is, when the true result does not fit in the destination. Your code is very close. I was able to set the OF flag with the following (VC++) code:

char ovf = 0;

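_asm {
    mov bh, 127   ; BH = 0x7F, the largest signed 8-bit value
    inc bh        ; BH = 0x80, sign bit flips, OF is set
    seto ovf      ; store OF in ovf
}
cout << "ovf: " << int(ovf) << endl;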

When BH is incremented the MSB changes from a 0 to a 1, causing the OF to be set.

This also sets the OF:

char ovf = 0;

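_asm {
    mov bh, 128   ; BH = 0x80, i.e. -128 as a signed 8-bit value
    dec bh        ; BH = 0x7F, sign bit flips, OF is set
    seto ovf      ; store OF in ovf
}
cout << "ovf: " << int(ovf) << endl;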

Keep in mind that the processor does not distinguish between signed and unsigned numbers. When you use 2's complement arithmetic, you can have one set of instructions that handle both. If you want to test for unsigned overflow, you need to use the carry flag. Since INC/DEC don't affect the carry flag, you need to use ADD/SUB for that case.
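To make the signed/unsigned distinction concrete, here is a small Python sketch that models 8-bit ADD and SUB and derives both flags. The functions add8 and sub8 are hypothetical helpers written for this illustration; they model ADD/SUB (which update the carry flag), not INC/DEC.

```python
def add8(a, b):
    """Model an 8-bit ADD; return (result, carry_flag, overflow_flag)."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    carry = total > 0xFF  # unsigned result did not fit in 8 bits
    # Signed overflow: both operands have the same sign bit,
    # but the result's sign bit differs from theirs.
    overflow = ((a ^ result) & (b ^ result) & 0x80) != 0
    return result, carry, overflow


def sub8(a, b):
    """Model an 8-bit SUB (a - b); return (result, carry_flag, overflow_flag)."""
    a &= 0xFF
    b &= 0xFF
    result = (a - b) & 0xFF
    carry = a < b  # a borrow was needed (unsigned underflow)
    # Signed overflow: operands have different signs,
    # and the result's sign differs from the first operand's.
    overflow = ((a ^ b) & (a ^ result) & 0x80) != 0
    return result, carry, overflow


print(add8(127, 1))   # signed overflow only
print(sub8(128, 1))   # signed overflow only
print(add8(255, 1))   # unsigned carry only
```

The same bit pattern is produced either way; only the choice of flag decides whether it is read as a signed or an unsigned overflow.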

Wednesday, July 28, 2021
answered 5 Months ago

You can, if you "introduce" the new label in the training y set too, like this:

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.multiclass import OneVsRestClassifier
from sklearn import preprocessing
from sklearn.metrics import accuracy_score

X_train = np.array(["new york is a hell of a town",
                "new york was originally dutch",
                "the big apple is great",
                "new york is also called the big apple",
                "nyc is nice",
                "people abbreviate new york city as nyc",
                "the capital of great britain is london",
                "london is in the uk",
                "london is in england",
                "london is in great britain",
                "it rains a lot in london",
                "london hosts the british museum",
                "new york is great and so is london",
                "i like london better than new york"])
# one label list per row of X_train (14 rows)
y_train_text = [["new york"],["new york"],["new york"],["new york"],
                ["new york"],["new york"],["london"],["london"],
                ["london"],["london"],["london"],["london"],
                ["new york","England"],["new york","london"]]

X_test = np.array(['nice day in nyc',
               'welcome to london',
               'london is rainy',
               'it is raining in britian',
               'it is raining in britian and the big apple',
               'it is raining in britian and nyc',
               'hello welcome to new york. enjoy it here and london too'])

y_test_text = [["new york"],["new york"],["new york"],["new york"],["new york"],["new york"],["new york"]]

lb = preprocessing.MultiLabelBinarizer(classes=("new york","london","England"))
Y = lb.fit_transform(y_train_text)
Y_test = lb.fit_transform(y_test_text)

print(Y_test)

classifier = Pipeline([
    ('vectorizer', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('clf', OneVsRestClassifier(LinearSVC()))])
classifier.fit(X_train, Y)
predicted = classifier.predict(X_test)
print(predicted)

print("Accuracy Score: ", accuracy_score(Y_test, predicted))


Accuracy Score:  0.571428571429
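
For context, the score is consistent with 4 of the 7 test rows matching exactly (4/7 ≈ 0.571), because accuracy_score on multilabel indicator arrays counts a row as correct only when every label in it matches. The sketch below shows that rule in plain Python; subset_accuracy is a hypothetical helper, and the predictions are made up for illustration, not the classifier's actual output.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of rows whose full label vector matches exactly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return matches / len(y_true)


# Columns: "new york", "london", "England". All 7 true rows are ["new york"].
y_true = [[1, 0, 0]] * 7
# Made-up predictions: 4 rows exactly right, 3 rows wrong in at least one label.
y_pred = [[1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0],
          [0, 1, 0], [1, 1, 0], [0, 1, 0]]

print(subset_accuracy(y_true, y_pred))
```

Note that the row [1, 1, 0] gets "new york" right but still counts as a miss, which is why this metric looks harsh on multilabel problems.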

The key section is:

y_train_text = [["new york"],["new york"],["new york"],["new york"],
                ["new york"],["new york"],["london"],["london"],
                ["london"],["london"],["london"],["london"],
                ["new york","England"],["new york","london"]]

This is where we inserted "England". It makes sense: how could the classifier predict a label it has never seen before? This way we created a three-label classification problem.


lb = preprocessing.MultiLabelBinarizer(classes=("new york","london","England"))

You have to pass the classes as an argument to MultiLabelBinarizer(), and then it will work with any y_test_text.
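
To see why fixing the class list helps, here is a plain-Python sketch of the indicator encoding with a fixed column order. The binarize function is a hypothetical stand-in written for this illustration, not scikit-learn's implementation; it assumes labels outside the class list are simply skipped.

```python
def binarize(label_sets, classes):
    """Encode each label set as a 0/1 row with one column per class."""
    column = {name: i for i, name in enumerate(classes)}
    rows = []
    for labels in label_sets:
        row = [0] * len(classes)
        for label in labels:
            if label in column:  # assumption: unknown labels are skipped
                row[column[label]] = 1
        rows.append(row)
    return rows


classes = ("new york", "london", "England")

# Training and test labels get the same three columns,
# even though the test set never mentions "london" or "England".
print(binarize([["new york"], ["new york", "England"]], classes))
print(binarize([["new york"], ["new york"]], classes))
```

Because the column layout comes from classes rather than from whatever labels happen to appear, the training and test matrices always have matching shapes.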

Thursday, September 9, 2021
Ujjawal Khare
answered 3 Months ago

Try this:

from sklearn.decomposition import PCA

pca = PCA(n_components=8)
X_pca = pca.fit_transform(X)

That is, you simultaneously fit PCA to X and transform it into a (1000, 8) array named X_pca. That's what you should use instead of pca.components_.

Thursday, October 21, 2021
answered 2 Months ago

The general rule for AT&T x86 assembly syntax is

displacement(base, index, scale) = base + displacement + (index * scale)
  1. %eax refers to actual value of the register(=0x100).
  2. 0x104 refers to the value at address 0x104.
  3. $0x108 refers to the constant value 0x108.
  4. (%eax) refers to the value at address EAX, i.e. the value stored at 0x100 (=0xFF).
  5. 4(%eax) refers to the value at address EAX+4, which is at 0x104.
  6. 9(%eax, %edx) refers to the value at address EAX+9 + EDX, which is at 0x10C.
  7. 260(%ecx, %edx) refers to the value at address ECX+260 + EDX, which is at 0x108.
  8. 0xFC(,%ecx,4) refers to the value at address (ECX*4)+0xFC, which is at 0x100.
  9. (%eax, %edx, 4) refers to the value at address EAX+(EDX*4), which is at 0x10C.
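
The address arithmetic in items 5 to 9 can be checked with a short Python sketch. The function effective_address and the REGS table are hypothetical, written only for this illustration; EAX = 0x100 comes from item 1, while ECX = 0x1 and EDX = 0x3 are assumptions inferred from the addresses the list reports. The parameter names follow the conventional displacement(base, index, scale) naming for the operand slots.

```python
# EAX is stated in the list; ECX and EDX are inferred from the listed results.
REGS = {"eax": 0x100, "ecx": 0x1, "edx": 0x3}


def effective_address(disp=0, base=None, index=None, scale=1):
    """Compute disp + base + index*scale for a 32-bit AT&T memory operand."""
    assert scale in (1, 2, 4, 8)
    addr = disp
    if base is not None:
        addr += REGS[base]
    if index is not None:
        addr += REGS[index] * scale
    return addr & 0xFFFFFFFF  # wrap to 32 bits


print(hex(effective_address(4, "eax")))              # 4(%eax)
print(hex(effective_address(9, "eax", "edx")))       # 9(%eax, %edx)
print(hex(effective_address(260, "ecx", "edx")))     # 260(%ecx, %edx)
print(hex(effective_address(0xFC, None, "ecx", 4)))  # 0xFC(,%ecx,4)
print(hex(effective_address(0, "eax", "edx", 4)))    # (%eax, %edx, 4)
```

This only computes where the operand points; the value the instruction actually uses is whatever is stored at that address.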
Sunday, November 21, 2021
Jean-François Corbett
answered 3 Weeks ago