How to clear cuda errors?

That is quite possible. But your posted sample code does not actually do that. Here is a sample app that implements it:

#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

/* Start with a request of 64 GiB and halve it until an allocation succeeds */
#define INITIAL_ALLOC_SIZE (1ULL << 36)

int main (void)
{
    void *my_allocation = 0;
    unsigned long long int allocsize = INITIAL_ALLOC_SIZE;

    while (allocsize != 0) {
        printf ("trying to allocate %llu bytes ...\n", allocsize);
        if (cudaSuccess == cudaMalloc (&my_allocation, allocsize)) {
            printf ("Success! my_allocation = %p. Freeing memory & exiting\n",
                    my_allocation);
            cudaFree (my_allocation);
            return EXIT_SUCCESS;
        } else {
            printf ("Failed! Trying smaller size.\n");
            /* cudaGetLastError() returns the recorded error and resets it to
               cudaSuccess, so the failed attempt is not reported by later checks */
            cudaGetLastError ();
            allocsize /= 2;
        }
    }
    printf ("Could not allocate any memory\n");
    return EXIT_FAILURE;
}

Sample output on a system with a very low-end GPU:

C:\Users\Norbert\My Programs>decreasin_alloc_size
trying to allocate 68719476736 bytes ...
Failed! Trying smaller size.
trying to allocate 34359738368 bytes ...
Failed! Trying smaller size.
trying to allocate 17179869184 bytes ...
Failed! Trying smaller size.
trying to allocate 8589934592 bytes ...
Failed! Trying smaller size.
trying to allocate 4294967296 bytes ...
Failed! Trying smaller size.
trying to allocate 2147483648 bytes ...
Failed! Trying smaller size.
trying to allocate 1073741824 bytes ...
Failed! Trying smaller size.
trying to allocate 536870912 bytes ...
Success! my_allocation = 00000005008E0000. Freeing memory & exiting

If you cannot allocate any memory at all, you would want to look at the actual CUDA error code returned by cudaMalloc(). For example, there could be an incompatibility between the CUDA driver and the CUDA runtime that renders the CUDA runtime inoperable.
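
As a minimal sketch of how one might look at that error code: the program below makes a single small allocation request (the 1 MiB size and the variable names are arbitrary choices for illustration, not taken from the question), prints the name and description of any error, clears it with cudaGetLastError(), and then prints the driver and runtime versions so a mismatch can be spotted. Note that cudaGetLastError() only resets errors that are not sticky, such as an out-of-memory failure from cudaMalloc(); an error that corrupts the CUDA context generally cannot be cleared this way.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

int main (void)
{
    void *p = 0;
    size_t request = (size_t)1 << 20;   /* 1 MiB, an arbitrary small size */
    int driver_version = 0;
    int runtime_version = 0;

    cudaError_t status = cudaMalloc (&p, request);
    if (status != cudaSuccess) {
        /* Report the symbolic name and the human-readable description */
        printf ("cudaMalloc failed: %s (%s)\n",
                cudaGetErrorName (status), cudaGetErrorString (status));

        /* Returns the last recorded error and resets it to cudaSuccess */
        cudaGetLastError ();

        /* Versions are encoded as 1000 * major + 10 * minor; a driver
           version of 0 means no CUDA driver was found */
        cudaDriverGetVersion (&driver_version);
        cudaRuntimeGetVersion (&runtime_version);
        printf ("driver supports CUDA %d.%d, runtime is CUDA %d.%d\n",
                driver_version / 1000, (driver_version % 100) / 10,
                runtime_version / 1000, (runtime_version % 100) / 10);
        return EXIT_FAILURE;
    }

    printf ("Allocated %llu bytes at %p\n",
            (unsigned long long int)request, p);
    cudaFree (p);
    return EXIT_SUCCESS;
}
```

If the driver version printed is lower than the runtime version, the error reported is typically cudaErrorInsufficientDriver, and updating the driver is the usual remedy.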