Asked  7 Months ago    Answers:  5   Viewed   119 times

I have been working with a string[] array in C# that gets returned from a function call. I could possibly cast to a Generic collection, but I was wondering if there was a better way to do it, possibly by using a temp array.

What is the best way to remove duplicates from a C# array?

 Answers

82

You could possibly use a LINQ query to do this:

using System.Linq; // Distinct() and ToArray() are LINQ extension methods

int[] s = { 1, 2, 3, 3, 4 };
int[] q = s.Distinct().ToArray(); // q holds 1, 2, 3, 4
Tuesday, June 1, 2021
 
qitch
answered 7 Months ago
83

You can do it via:

//$rgData comes from preg_match_all
$rgResult = array_map('array_unique', $rgData);
Saturday, May 29, 2021
 
Crontab
answered 7 Months ago
45

Let's get the important stuff out of the way first: arrays are not pointers. Array types and pointer types are completely different things and are treated differently by the compiler.

The confusion arises from how C treats array expressions. From N1570:

6.3.2.1 Lvalues, arrays, and function designators

...
3 Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.

Let's look at the following declarations:

int arr[10] = {0,1,2,3,4,5,6,7,8,9};
int *parr = arr;

arr is a 10-element array of int; it refers to a contiguous block of memory large enough to store 10 int values. The expression arr in the second declaration is of array type, but since it is not the operand of & or sizeof and it isn't a string literal, the type of the expression becomes "pointer to int", and the value is the address of the first element, or &arr[0].

parr is a pointer to int; it refers to a block of memory large enough to hold the address of a single int object. It is initialized to point to the first element in arr as explained above.

Here's a hypothetical memory map showing the relationship between the two (assuming 16-bit ints and 32-bit addresses):

Object           Address         0x00  0x01  0x02  0x03
------           -------         ----------------------
   arr           0x10008000      0x00  0x00  0x00  0x01
                 0x10008004      0x00  0x02  0x00  0x03
                 0x10008008      0x00  0x04  0x00  0x05
                 0x1000800c      0x00  0x06  0x00  0x07
                 0x10008010      0x00  0x08  0x00  0x09
  parr           0x10008014      0x10  0x00  0x80  0x00

The types matter for things like sizeof and &; sizeof arr == 10 * sizeof (int), which in this case is 20, whereas sizeof parr == sizeof (int *), which in this case is 4. Similarly, the type of the expression &arr is int (*)[10], or a pointer to a 10-element array of int, whereas the type of &parr is int **, or pointer to pointer to int.
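The size and type differences described above can be sketched as follows. The struct `sizes` and the function `measure_sizes` are hypothetical names introduced only for illustration. The actual byte counts depend on the platform, so the sketch relies only on relationships the language guarantees.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: sizes observed for an array and for a pointer. */
struct sizes {
    size_t array_bytes;    /* sizeof applied to the array object itself */
    size_t pointer_bytes;  /* sizeof applied to a pointer to its first element */
    size_t element_count;  /* element count recovered from the array type */
};

struct sizes measure_sizes(void)
{
    int arr[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    int *parr = arr;          /* arr decays to &arr[0] here */
    int (*whole)[10] = &arr;  /* no decay: &arr has type int (*)[10] */
    int **pp = &parr;         /* &parr has type int ** */

    /* Same address, different types. */
    assert((void *) whole == (void *) parr);
    assert(*pp == parr);

    struct sizes s;
    s.array_bytes = sizeof arr;     /* 10 * sizeof (int); no decay under sizeof */
    s.pointer_bytes = sizeof parr;  /* sizeof (int *) */
    s.element_count = sizeof arr / sizeof arr[0];
    return s;
}
```

Calling `measure_sizes()` on the hypothetical platform from the memory map would give 20, 4, and 10; on a typical 64-bit desktop the first two values are more likely 40 and 8.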

Note that the expressions arr and &arr will yield the same value (the address of the first element in arr), but the types of the expressions are different (int * and int (*)[10], respectively). This makes a difference when using pointer arithmetic. For example, given:

int arr[10] = {0,1,2,3,4,5,6,7,8,9};
int *p = arr;
int (*ap)[10] = &arr;

printf("before: arr = %p, p = %p, ap = %p\n", (void *) arr, (void *) p, (void *) ap);
p++;
ap++;
printf("after: arr = %p, p = %p, ap = %p\n", (void *) arr, (void *) p, (void *) ap);

the "before" line should print the same values for all three expressions (in our hypothetical map, 0x10008000). The "after" line should show three different values: 0x10008000, 0x10008002 (base plus sizeof (int)), and 0x10008014 (base plus sizeof (int [10])).
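To check the two step sizes without depending on printed addresses, one possible sketch measures how many bytes a single increment moves each pointer. The function names `int_pointer_stride` and `array_pointer_stride` are hypothetical and not part of the original answer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes moved by one increment of an int *. */
size_t int_pointer_stride(void)
{
    int arr[10] = {0};
    int *p = arr;                    /* arr decays to &arr[0] */
    char *before = (char *) p;
    assert(before == (char *) arr);  /* p starts at the base of the array */
    p++;                             /* advances by one int */
    return (size_t) ((char *) p - before);
}

/* Hypothetical helper: bytes moved by one increment of an int (*)[10]. */
size_t array_pointer_stride(void)
{
    int arr[10] = {0};
    int (*ap)[10] = &arr;            /* pointer to the whole array */
    char *before = (char *) ap;
    ap++;                            /* advances by one whole int [10];
                                        one past the end is a valid pointer value */
    return (size_t) ((char *) ap - before);
}
```

The first function returns `sizeof (int)` and the second returns `sizeof (int [10])`, which matches the 2-byte and 20-byte steps in the hypothetical memory map.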

Now let's go back to the second paragraph above: array expressions are converted to pointer types in most circumstances. Let's look at the subscript expression arr[i]. Since the expression arr is not appearing as an operand of either sizeof or &, and since it is not a string literal being used to initialize another array, its type is converted from "10-element array of int" to "pointer to int", and the subscript operation is being applied to this pointer value. Indeed, when you look at the C language definition, you see the following language:

6.5.2.1 Array subscripting
...
2 A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th element of E1 (counting from zero).

In practical terms, this means you can apply the subscript operator to a pointer object as though it were an array. This is why code like

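#include <stddef.h>  /* size_t */

/* Sums the first `size` elements reachable through p. */
int foo(int *p, size_t size)
{
  int sum = 0;
  size_t i;  /* same type as size, avoiding a signed/unsigned comparison */
  for (i = 0; i < size; i++)
  {
    sum += p[i];  /* p[i] is *(p + i) */
  }
  return sum;
}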

int main(void)
{
  int arr[10] = {0,1,2,3,4,5,6,7,8,9};
  int result = foo(arr, sizeof arr / sizeof arr[0]);
  ...
}

works the way it does. main is dealing with an array of int, whereas foo is dealing with a pointer to int, yet both are able to use the subscript operator as though they were both dealing with an array type.

It also means array subscripting is commutative: assuming a is an array expression and i is an integer expression, a[i] and i[a] are both valid expressions, and both will yield the same value.
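The equivalence can be seen in a small sketch that reads the same element three ways. The helper names `element_at` and `third_element` are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: reads element i in three equivalent ways. */
int element_at(const int *a, size_t i)
{
    int via_subscript = a[i];      /* the usual form */
    int via_swapped   = i[a];      /* legal, because i[a] is *(i + a) */
    int via_pointer   = *(a + i);  /* what both forms mean by definition */

    assert(via_subscript == via_swapped);
    assert(via_subscript == via_pointer);
    return via_subscript;
}

/* Hypothetical caller: passes a real array, which decays at the call. */
int third_element(void)
{
    int arr[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    return element_at(arr, 2);     /* arr is converted to &arr[0] */
}
```

The `i[a]` form is rarely useful in real code; it appears here only to show that the subscript operator is defined in terms of pointer addition.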

Tuesday, June 1, 2021
 
StampyCode
answered 7 Months ago
61

Starting from Thomas Levesque's suggestion I've built a simple ArraySegmentWrapper<T> class to use in this way:

static void Main(string[] args)
{
    int[] arr = new int[10];
    for (int i = 0; i < arr.Length; i++)
        arr[i] = i;

    // arr = 0,1,2,3,4,5,6,7,8,9

    var segment = new ArraySegmentWrapper<int>(arr, 2, 7);
    segment[0] = -1;
    segment[6] = -1;
    // now arr = 0,1,-1,3,4,5,6,7,-1,9


    // this prints: -1,3,4,5,6,7,-1
    foreach (var el in segment)
        Console.WriteLine(el);
}

Implementation:

// Requires: using System; using System.Collections.Generic;
public class ArraySegmentWrapper<T> : IList<T>
{
    private readonly ArraySegment<T> segment;

    public ArraySegmentWrapper(ArraySegment<T> segment)
    {
        this.segment = segment;
    }

    public ArraySegmentWrapper(T[] array, int offset, int count)
        : this(new ArraySegment<T>(array, offset, count))
    {
    }

    public int IndexOf(T item)
    {
        // Return the index relative to the segment, not to the underlying array.
        for (int i = 0; i < segment.Count; i++)
            if (Equals(segment.Array[segment.Offset + i], item))
                return i;
        return -1;
    }

    public void Insert(int index, T item)
    {
        throw new NotSupportedException();
    }

    public void RemoveAt(int index)
    {
        throw new NotSupportedException();
    }

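    public T this[int index]
    {
        get
        {
            // Reject negative indexes too, so reads stay inside the segment.
            if (index < 0 || index >= this.Count)
                throw new IndexOutOfRangeException();
            return this.segment.Array[index + this.segment.Offset];
        }
        set
        {
            // Reject negative indexes too, so writes stay inside the segment.
            if (index < 0 || index >= this.Count)
                throw new IndexOutOfRangeException();
            this.segment.Array[index + this.segment.Offset] = value;
        }
    }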

    public void Add(T item)
    {
        throw new NotSupportedException();
    }

    public void Clear()
    {
        throw new NotSupportedException();
    }

    public bool Contains(T item)
    {
        return this.IndexOf(item) != -1;
    }

    public void CopyTo(T[] array, int arrayIndex)
    {
        for (int i = segment.Offset; i < segment.Offset + segment.Count; i++)
        {
            array[arrayIndex] = segment.Array[i];
            arrayIndex++;
        }
    }

    public int Count
    {
        get { return this.segment.Count; }
    }

    public bool IsReadOnly
    {
        get { return false; }
    }

    public bool Remove(T item)
    {
        throw new NotSupportedException();
    }

    public IEnumerator<T> GetEnumerator()
    {
        for (int i = segment.Offset; i < segment.Offset + segment.Count; i++)
            yield return segment.Array[i];
    }

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

EDIT :

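As pointed out by @JeppeStigNielsen in the comments, since .NET 4.5 ArraySegment<T> implements IList<T> itself, so on .NET 4.5 or later the segment can be used through that interface and this wrapper is unnecessary.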

Thursday, July 29, 2021
 
mario
answered 5 Months ago
98

Leaves the original intact, only loops once, preserves order:

nameArray := ["Chris","Joe","Marcy","Chris","Elina","Timothy","Joe"]

trimmedArray := trimArray(nameArray)

trimArray(arr) { ; Hash O(n) 

    hash := {}, newArr := []

    for e, v in arr
        if (!hash.Haskey(v))
            hash[(v)] := 1, newArr.push(v)

    return newArr
}

An alternative to calling the HasKey method is to check the value stored in the hash object directly. This may prove faster, but I'll leave the testing to you.

trimArray(arr) { ; Hash O(n) 

    hash := {}, newArr := []

    for e, v in arr
        if (!hash[v])
            hash[(v)] := 1, newArr.push(v)

    return newArr
}

Edit: Initially I wasn't going to test, but I got curious as well as tired of waiting on the OP. The results don't surprise me much:

[Image: benchmark results; not available]

These are the average execution times over 10,000 runs; the lower the number, the faster the script. The clear winner is my variation without the HasKey method, but only by a tiny margin. All of the other methods were doomed, since they are not linear solutions.

Test Code is here:

setbatchlines -1 

tests := {test1:[], test2:[], test3:[], test4:[]}

Loop % 10000 {
    nameArray := ["Chris","Joe","Marcy","Chris","Elina","Timothy","Joe"]

    QPC(1)

    jimU(nameArray)

    test1 := QPC(0), QPC(1)

    AbdullaNilam(nameArray)

    test2 := QPC(0), QPC(1)

    ahkcoderVer1(nameArray)

    test3 := QPC(0), QPC(1)

    ahkcoderVer2(nameArray)

    test4 := QPC(0)

    tests["test1"].push(test1), tests["test2"].push(test2)
    , tests["test3"].push(test3), tests["test4"].push(test4)
}

scripts := ["Jim U         ", "Abdulla Nilam  "
            , "ahkcoder HasKey", "ahkcoder Bool  " ]

for e, testNums in tests ; Averages Results
    r .= "Test Script " scripts[A_index] "`t:`t" sum(testNums) / 10000 "`n"


msgbox % r

AbdullaNilam(names) {

    for i, namearray in names
        for j, inner_namearray in names
            if (A_Index > i && namearray = inner_namearray)
                names.Remove(A_Index)
    return names
}

JimU(nameArray) {
  hash := {}
  for i, name in nameArray
    hash[name] := null

  trimmedArray := []
  for name, dummy in hash
    trimmedArray.Insert(name)

  return trimmedArray
}

ahkcoderVer1(arr) { ; Hash O(n) - Linear

    hash := {}, newArr := []

    for e, v in arr
        if (!hash.Haskey(v))
            hash[(v)] := 1, newArr.push(v)

    return newArr
}

ahkcoderVer2(arr) { ; Hash O(n) - Linear

    hash := {}, newArr := []

    for e, v in arr
        if (!hash[v])
            hash[(v)] := 1, newArr.push(v)

    return newArr
}

sum(arr) {
    r := 0
    for e, v in arr
        r += v
    return r
}

QPC(R := 0) ; https://autohotkey.com/boards/viewtopic.php?t=6413
{
    static P := 0, F := 0, Q := DllCall("QueryPerformanceFrequency", "Int64P", F)
    return ! DllCall("QueryPerformanceCounter", "Int64P", Q) + (R ? (P := Q) / F : (Q - P) / F) 
}
Saturday, November 27, 2021
 
g00fy
answered 1 Week ago