Git :: Removing files from all commits

May 4th, 2012

Alright… this is just a tiny hint on the process I used to nuke some committed files from all commit history

git filter-branch –index-filter “git rm -rf –cached –ignore-unmatch my_files” HEAD

rm -rf .git/refs/original/

git reflog expire –all

git gc –aggressive –prune

git origin +master

And here comes the explanation:

# git filter-branch –index-filter “git rm -rf –cached –ignore-unmatch my_files” HEAD

Our action is to rewrite our branch, hence we need to use the top level “filter-branch” command.

–index-filter: there is no need to checkout the current branch, so we can move faster and simply filter in which we issue the “git rm -rf …”

git-rm: quiet obvious

–cached: Only match the paths in the index – leaving modified matching files untouched
–ignore-unmatch: result 0 status in any case if no match

HEAD: obviously, we are working on our last commit

# rm -rf .git/refs/original/

Even with our branch rewrite from earlier, we still have a backup in refs/original, so we need to delete it

# git reflog expire –all

Here is where it gets interesting, you see, each action performed inside git is “backed up” in the reflog. Think of it as a safety net, which is an inventory hash of all the points you been at for each commit. So it is possible to restore the files commit from the reflog, hence “expire –all”.

# git gc –aggressive –prune

Oh my… gc what? “garbage collector?”, well actually we have rewritten the branch, purged the reflog and we are left with a lot of unused objects, so time to save some disk space and clean up

# git push origin +master

Well… I don’t want to merge and then push, that would be deafeating the whole purpose of my previous actions and since no one has yet pulled from this repo, so we need to force the non-fast-forward since we are pretty much breaking the objects inheritance, hence the “+master”

 

 

Bookmark and Share

The so-called Skype SDK IP leaks

Apr 29th, 2012

For the last few days, there has been a buzzing news in the community, following the recent discovery of a so-called information leak in the skype SDK. zhovner@github, published a python code sample “exploiting this vulnerability” https://github.com/zhovner/Skype-iplookup/ using a de-obfuscated SDK and published a demo site @http://skype-ip-finder.tk/. More related information on the skype-open-source project can be found @ http://skype-open-source.blogspot.de/

So to sump-up, the “so-called leak” takes place by:

1. having “debug logging enabled” in the hi-jacked SDK

2. triggering a lookup information on a user who happens to be online, such as seeing his vcard

and blam! both private and public ip addresses of this user are shown in the log.

 

OMG cat, Surprised cat (lolcat) says OMG

 

Now, I say “so-called” leak, because truth be told, this isn’t leak nor a bug and here is why:

The skype protocol is merely at its core a P2P protocol – while it has never standardized and is fully proprietary, a minimum understanding on how P2P architecture operates clearly explain why the skype client passes such information, after all, there aren’t truly such things as relay servers in a P2P network,  clients in this case are exchanging the VoIP traffic directly and doing all the processing. Skype’s communication architecture constitutes of 3 types of nodes, “login servers”, “supernodes” and “standard nodes”.

To put it as simple as possible, “login servers” are the servers you connect to authenticate, “standard nodes” are clients (skype clients) sitting behind a NAT firewall and “supernodes” (most interesting for this article) are simply the opposite of “standard nodes”, in other words, they are directly assigned a public IP address and no firewall rules are blocking a direct connection to the skype random port client.

For a call to take place between 2 “standard nodes”, they must have a direct non NATed connection, since “standard nodes” sit behind a NAT/firewall, a direct connection between the 2 hosts is therefore not possible, to overcome this skype uses a technique called “UDP hole-punching” and this is where “supernodes” come into play.

 

UDP hole-punching “simplified”

UDP hole-punching is a simple technique which persists in somewhat convincing the firewall that the incoming UDP packets are responses of an already established connection. Now I say “established”, but remember, UDP is not connection oriented, so we basically have no concept of sessions and all that nasty-SYN-overhead :-P , but we have a NAT recorded session which we could exploit.

To clarify: assume we have 2 hosts A, B each with a NAT/firewall and a random server called X. UDP hole-punching works as follow:

1. A and B connect to X (whether that be UDP or TCP)

2. Through the connection, X is aware of A and B’s public NAT and source ports

3. X communicate to A, B’s public NAT and source port and vice-versa to B

4. A sends a UDP packet with its previous source port to B’s public NAT

4. Naturally B’s firewall will reject the UDP packet, but as A sent that packet to B’s NAT IP, A’s firewall recorded the NAT session with the source port used by A and here is the punchole

5. B is aware of A’s source port from the exchange with server X, so B sends a packet A’s NAT IP with the destination source as A’s source port, since the NAT session was recorded in step 4, blam! the firewall forwards B’s UDP packet to A.

Now that you have a basic understanding of UDP hole-punching, keep in mind that a supernode will act as a relay in the case that A and B are still unable to communicate in cases where the firewalls are randomizing the source IP of the clients when the NAT process takes place.

Finally, you may ask:

“Ok, I get it! A needs to know B’s public IP and vice-versa, but why the private IP addresses, as seen in the so-called skype SDK IP leak”.

Well, you are right, why is Skype communicating the private IP address? Well, it is like answering, why would 2 hosts communicate over their NAT IPs if they could directly communicate through their privately assigned IP addresses? in other words, if 2 clients are located on 2 routeable LANs, it would be faster for them to exchange internally than externally and hence the reasons why during the hole-punching process, both A and B reported not only their public IPs but also their private IPs which server X exchanged between the 2 clients, both ending up knowing the public and private IPs as well as source ports of the opposite peer.

So in the case of skype, it is pretty obvious now what supernodes are useful for, further than being simple directory service to lookup contacts.

So… there is no leak, just some guys who figured out how to enable “printf” in an SDK which hooks to your client :)

Cool CAt

Needless to say, if you are curious as the kind of information which circumvent over skype  – read more @ http://www.scribd.com/doc/69593950/Skype

 

 

Bookmark and Share
Tags: , , ,

Mercedes Museum

Apr 22nd, 2012

Here are a few snaps from my visit at the Mercedes Museum this weekend in Stuttgart – for those visiting Germany, it is definitely worth the stop. It was both cultivating and a lot of fun :)

Bookmark and Share

Lower initial TCP RTO – Redhat kernel patch

Feb 24th, 2012

I have recently back-ported the rfc2988bis changes (initRTO=1 and fallack) to the redhat 2.6.32 kernel – find the patch on my github account at @ https://github.com/alouche/redhat-2.6.32-kernel-patches/blob/master/rfc2988bis.patch

On short lived connections with a lot of 3WHS, a lower initial RTO will improve 3WHS latency by 2*2000ms*X% (X% being the average of packet drops of a specific route). For further technical details, refer to https://www.ietf.org/proceedings/77/slides/tcpm-1.pdf

 

 

Bookmark and Share

Linux CFS Algorithm and Virtual Runtime

Feb 4th, 2012

Since the 2.6.23 kernel, the Linux kernel process scheduler previously O(1) was replaced by CFS – a Completely Fair Scheduler.

CFS uses a red-black tree as data-structure and unlike previous Unix process scheduler does not account a traditional time slice of process execution but accounts what is referred as the process virtual runtime, expressed in nanoseconds (as opposed to Hz or jiffies). The usage of a self-balanced tree as the red-black tree allows for a lookup of O(log n) time per the height of the tree, but more on this later.

Virtual Runtime

The Virtual Runtime, declaration “vruntime” is a variable part of the process inherited C structure as defined in linux/sched.h which accounts for the time a process run in relation to the number of running processes. Each process holds a process descriptor “task_struct” dynamically allocated via the slab allocator.

struct task_struct {
         volatile long state;    /* -1 unrunnable, 0 runnable, >0 stopped */
         void *stack;
         atomic_t usage;
         unsigned int flags;     /* per process flags, defined below */
         unsigned int ptrace;
(...)
#endif
         int on_rq;
         int prio, static_prio, normal_prio;
         unsigned int rt_priority;
         const struct sched_class *sched_class;
         struct sched_entity se;

The “sched_entity” structure linked in “task_struct” is defined as

struct sched_entity {
         struct load_weight      load;           /* for load-balancing */
         struct rb_node          run_node;
         struct list_head        group_node;
         unsigned int            on_rq;

         u64                     exec_start;
         u64                     sum_exec_runtime;
         u64                     vruntime;
         u64                     prev_sum_exec_runtime;

         u64                     nr_migrations;

#ifdef CONFIG_SCHEDSTATS
         struct sched_statistics statistics;
#endif

#ifdef CONFIG_FAIR_GROUP_SCHED
         struct sched_entity     *parent;
         /* rq on which this entity is (to be) queued: */
         struct cfs_rq           *cfs_rq;
         /* rq "owned" by this entity/group: */
         struct cfs_rq           *my_q;
#endif
};

While the other variables also play a role in CFQ decision’s algorithm, vruntime is by far the core variable which needs more attention as to understand the scheduling decision process. As stated earlier, CFQ does not account time slice as did previous schedulers, the vruntime is evaluated

The vruntime for each process is calculated as followed:

1- Compute the time spent by the process on the CPU
2- Weight the computed running time against the number of runnable processes

The “update_curr” function is responsible to calculate the running time spent by the process, which stores the value into an unsigned long variable “delta_exec”. “delta_exec” is computed as followed:

 delta_exec = (unsigned long)(now - curr->exec_start);

with

 struct sched_entity *curr = cfs_rq->curr;
 u64 now = rq_of(cfs_rq)->clock_task;

“delta_exec” is passed unto __update_curr

__update_curr(struct cfs_rq *cfs_rq, struct sched_entity *curr,
               unsigned long delta_exec)
{
         unsigned long delta_exec_weighted;

         schedstat_set(curr->statistics.exec_max,
                       max((u64)delta_exec, curr->statistics.exec_max));

         curr->sum_exec_runtime += delta_exec;
         schedstat_add(cfs_rq, exec_clock, delta_exec);
         delta_exec_weighted = calc_delta_fair(delta_exec, curr);

         curr->vruntime += delta_exec_weighted;
         update_min_vruntime(cfs_rq);

#if defined CONFIG_SMP && defined CONFIG_FAIR_GROUP_SCHED
         cfs_rq->load_unacc_exec_time += delta_exec;
#endif
}

calc_delta_fair” will return the weighted value of the process’s calculated runtime “delta_exec” in relation to number of runnable processes. The sub function used for that calculation is “calc_delta_mine”.

To keep in mind:

  • unsigned long delta_exec, is the computed running time of the process
  • unsigned long weight, is the nice value of the process
  • struct load_weight *lw, composed of 2 unsigned long “weight” and  “inv_weigh” (lw->weight and lw->inv_weight)
calc_delta_mine(unsigned long delta_exec, unsigned long weight,
                 struct load_weight *lw)
{
         u64 tmp;

  /*
  * weight can be less than 2^SCHED_LOAD_RESOLUTION for task group sched
  * entities since MIN_SHARES = 2. Treat weight as 1 if less than
  * 2^SCHED_LOAD_RESOLUTION.
  */
         if (likely(weight > (1UL << SCHED_LOAD_RESOLUTION)))
                 tmp = (u64)delta_exec * scale_load_down(weight);
         else
                 tmp = (u64)delta_exec;

         if (!lw->inv_weight) {
                 unsigned long w = scale_load_down(lw->weight);

                 if (BITS_PER_LONG > 32 && unlikely(w >= WMULT_CONST))
                         lw->inv_weight = 1;
                 else if (unlikely(!w))
                         lw->inv_weight = WMULT_CONST;
                 else
                         lw->inv_weight = WMULT_CONST / w;
         }


         if (unlikely(tmp > WMULT_CONST))
                tmp = SRR(SRR(tmp, WMULT_SHIFT/2) * lw->inv_weight,
                         WMULT_SHIFT/2);
         else
                 tmp = SRR(tmp * lw->inv_weight, WMULT_SHIFT);

         return (unsigned long)min(tmp, (u64)(unsigned long)LONG_MAX);
}

Once the vruntime is computed, it is stored into the inherited sched_entity structure of the process by calling

         curr->vruntime += delta_exec_weighted;

in the previously seen __update_curr function, we also notice the call

         update_min_vruntime(cfs_rq);

What this does is to compute the smallest vruntime of all runnable processes and store it at the leftmode node of the red-black tree.

The CFS selection algorithm for process to be run is extremely simple, it will run its red black tree and search for the smallest vruntime value pointing to the next runnable process. In other words, “run the process which is represented by the leftmost node in the tree”, since the leftmost node constains the smallest vruntime, referred in the source code as min_vruntime.

To conclude this post, it is important to note that CFS does not walk the whole tree since min_vruntime is referenced by rb_leftmost in the cfs_rq struct (kernel/sched.c)

struct cfs_rq {
         struct load_weight load;
        unsigned long nr_running, h_nr_running;

         u64 exec_clock;
         u64 min_vruntime;
#ifndef CONFIG_64BIT
         u64 min_vruntime_copy;
#endif

         struct rb_root tasks_timeline;
         struct rb_node *rb_leftmost

as seen in this construct (kernel/sched_fair.c)

static struct sched_entity *__pick_next_entity(struct sched_entity *se)
{
         struct rb_node *next = rb_next(&se->run_node);

         if (!next)
                 return NULL;

         return rb_entry(next, struct sched_entity, run_node);
}
Bookmark and Share