Cache (pronounced "cash") is a small amount of very fast memory that is part of the CPU. It is closer to the processor than RAM (main memory) is; in fact, it is on the CPU chip itself.
Cache is used to temporarily store any common instructions and data that the CPU is likely to reuse. It often stores the most recent data and instructions used in case they are needed again.
As part of the Fetch stage of the Fetch-Decode-Execute cycle, the CPU automatically checks the cache for the instruction it needs before going to RAM for it.
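The "check cache first, then RAM" idea can be sketched in a few lines of Python. This is only a toy model: real caches are built in hardware, and the class name TinyCache, the capacity of 4 and the rule of throwing out the least recently used item are all assumptions made for illustration.

```python
from collections import OrderedDict


class TinyCache:
    """Toy cache that keeps the most recently used items (hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> value, least recently used first
        self.hits = 0
        self.misses = 0

    def fetch(self, address, ram):
        # Check the cache first.
        if address in self.lines:
            self.hits += 1
            self.lines.move_to_end(address)  # mark as most recently used
            return self.lines[address]
        # Cache miss: go to RAM, then keep a copy in case it is needed again.
        self.misses += 1
        value = ram[address]
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used item
        return value


# Pretend RAM: 16 addresses, each holding one instruction.
ram = {addr: f"instruction_{addr}" for addr in range(16)}
cache = TinyCache(capacity=4)

# A program loop fetches the same three instructions three times over.
for _ in range(3):
    for addr in (0, 1, 2):
        cache.fetch(addr, ram)

print(cache.hits, cache.misses)  # 6 hits, 3 misses
```

Only the first pass through the loop has to go to RAM; the other two passes are served entirely from the cache.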
Having regularly used instructions and data in cache makes the cycle quicker. This is because cache is made of Static RAM (SRAM) rather than the Dynamic RAM (DRAM) used for main memory. SRAM is much quicker than DRAM but is a lot more expensive too.
Without cache memory there would be a performance bottleneck: the CPU would sit idle whilst it waits for RAM to pass it the data and instructions it needs. The CPU works much faster than RAM. Cache acts as a middleman between the two; it responds more quickly than main memory and helps keep the CPU busy.
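A rough calculation shows how much difference this can make. The timings and hit rate below are made-up round numbers chosen for illustration, not measurements from any real processor, and the function name average_access_time is hypothetical.

```python
def average_access_time(hit_rate, cache_ns, ram_ns):
    """Average time per access when every access checks the cache first
    and a miss then also pays for a trip to RAM."""
    miss_rate = 1.0 - hit_rate
    return cache_ns + miss_rate * ram_ns


CACHE_NS = 1.0   # assumed time to read from cache, in nanoseconds
RAM_NS = 100.0   # assumed time to read from RAM, in nanoseconds

without_cache = RAM_NS
with_cache = average_access_time(hit_rate=0.95, cache_ns=CACHE_NS, ram_ns=RAM_NS)

print(f"Without cache: {without_cache:.1f} ns per access")
print(f"With cache:    {with_cache:.1f} ns per access")
```

With these assumed figures, finding the item in cache 95% of the time cuts the average wait from 100 ns to about 6 ns, so the CPU spends far less time idle.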
Cache is organised into levels: Level 1 (L1), Level 2 (L2) and Level 3 (L3). The L1 cache is built into the CPU itself. It is the fastest level but also the smallest, often between 8 KB and 64 KB. L2 and L3 are extra caches that are bigger (but slower) than L1. On older systems L2 and L3 could sit on separate chips, but on modern processors they are usually part of the CPU chip as well.
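The order in which the levels are searched can be sketched as follows. This is a simplified model under stated assumptions: the function name lookup and the example contents are invented, the levels have no size limit, and data found in a lower level is not copied up to the faster ones, which real hardware would normally do.

```python
def lookup(address, levels, ram):
    """Search each cache level in order (fastest first), then fall back to RAM.

    Returns the value and the name of the place it was found.
    """
    for name, store in levels:
        if address in store:
            return store[address], name
    # Not in any cache level: fetch from RAM and keep a copy in every level.
    value = ram[address]
    for _, store in levels:
        store[address] = value
    return value, "RAM"


# Pretend RAM and three cache levels with some example contents.
ram = {addr: addr * 10 for addr in range(8)}
levels = [("L1", {}), ("L2", {3: 30}), ("L3", {3: 30, 5: 50})]

print(lookup(3, levels, ram))  # (30, 'L2')
print(lookup(5, levels, ram))  # (50, 'L3')
print(lookup(7, levels, ram))  # (70, 'RAM')
print(lookup(7, levels, ram))  # (70, 'L1') because a copy was kept
```

The second request for address 7 is answered by L1, the fastest level, because the first request left a copy there.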
Image Credit: Kbbuch / CC BY-SA
Video explaining Cache and why it is needed