Hahaha, when marketing becomes policy 
My definition is the definition that CPU designers use.
You are right that "marketing" often misused the RISC/CISC definitions,
and that companies like IBM came up with those definitions for marketing reasons.
Today CPUs are classified as either
1) LOAD-STORE, also called REGISTER-REGISTER or RISC:
these CPUs can only operate on registers and NOT on memory.
The opposite is
2) CISC architecture, which can operate on memory.
the x86 is actually a RISC machine! Since its non-orthogonal ISA often requires one to load data into registers for processing and then write it back to main memory.
The x86 can generally use 1 operand from memory.
The 68k can, for some operations, have 2 operands in memory.
A VAX can even have 3 operands in memory.
But a RISC machine can NEVER use an operand in memory.
PPC and ARM are examples of RISC chips that have woefully complex instruction sets,
Complex?
No, not really...
They have many instructions, and some instructions also take more than 1 cycle.
But their instructions are all regular, and complex neither in execution nor in decoding.
The main complexity that RISC took away from CISC was the decoding complexity.
The 68k supported instructions up to 10 bytes long. This was difficult enough to decode.
With the 68020 Motorola broke this record and supported instructions of even over 20 bytes. This complexity was a problem which made it really difficult to make the 68k fast.
The common denominator of CISC chips is their complex addressing modes.
That instructions can operate on memory, and sometimes can even have more than one operand in memory, makes them very complex to decode.
So complex that it became very challenging for CPU developers
to invent decoders which are able to decode more than 1 instruction per cycle.
Not all CISC chips are equally complex to decode.
The 68000 was complex, but the instruction size could be determined by decoding 16 bits. This is OK.
The Z chip, IBM's CISC mainframe design: its instruction size can be determined by evaluating only 2 bits. This is nice.
The added addressing modes of the 68020+, in contrast, make it necessary to look at 10 bytes = 80 bits to be able to determine an instruction's length. This change was a really big mistake by Motorola.
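To show how cheap the 2-bit case is, here is a minimal Python sketch. It assumes the encoding used by the IBM mainframe line, where the top two bits of the first opcode byte give the length: 00 means 2 bytes, 01 or 10 means 4 bytes, 11 means 6 bytes. The function names are my own, only for illustration.

```python
def z_instruction_length(first_byte):
    """Return the instruction length in bytes, looking only at the top 2 bits."""
    top2 = (first_byte >> 6) & 0b11
    if top2 == 0b00:
        return 2
    if top2 == 0b11:
        return 6
    return 4  # 0b01 and 0b10


def split_stream(code):
    """Cut a byte string into instructions using only the length bits."""
    instructions = []
    pos = 0
    while pos < len(code):
        length = z_instruction_length(code[pos])
        instructions.append(code[pos:pos + length])
        pos += length
    return instructions


# Three instructions whose first bytes are 0x07, 0x58 and 0xD2
stream = bytes.fromhex("07FE" "58102000" "D20310002000")
lengths = [len(i) for i in split_stream(stream)]
```

A decoder can find the start of the next instruction without understanding anything else about the current one, which is what makes decoding several instructions per cycle practical.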
If you want to understand whether a chip is CISC or RISC then simply check a few points:
Can the chip support 3, 2 or even 1 operands in memory?
RISC can't.
Does the chip allow updating only parts of its registers in a BYTE/WORD/LONGWORD fashion?
RISC doesn't.
Does it allow full-size immediates encoded in its instructions?
Like 32-bit or 64-bit immediates?
RISC doesn't.
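The checklist above can be written down as a small Python sketch. The class and function names are hypothetical, and treating any single "yes" as enough to call a chip CISC is my own simplification. The memory operand counts come from this post; the other two flags are filled in for illustration.

```python
from dataclasses import dataclass


@dataclass
class IsaTraits:
    name: str
    max_memory_operands: int       # memory operands one instruction may use
    partial_register_writes: bool  # BYTE/WORD/LONGWORD updates of a register
    full_size_immediates: bool     # 32-bit or 64-bit immediates in the instruction


def classify(isa):
    """Apply the three checks; a RISC chip answers 'no' to all of them."""
    cisc_signs = [
        isa.max_memory_operands > 0,
        isa.partial_register_writes,
        isa.full_size_immediates,
    ]
    return "CISC" if any(cisc_signs) else "RISC"


chips = [
    IsaTraits("68k", 2, True, True),
    IsaTraits("x86", 1, True, True),
    IsaTraits("POWER", 0, False, False),
]
result = {chip.name: classify(chip) for chip in chips}
```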
Of course not all chips in one category are the same.
VAX was more CISCy than all other CISC chips.
The VAX could read two operands from memory, do an operation on them and store the result as a third operand back to memory. And all this in a single instruction.
The 68k can use 2 memory operands in only a few instructions,
namely ADDX, SUBX, CMPM, MOVE, ABCD, SBCD.
The x86 generally only allows 1 memory operand.
RISC chips do not allow even 1 memory operand.
Coding for RISC chips is different than coding for CISC chips.
With CISC chips you can use immediates easily.
With RISC chips you can only use small immediates embedded in your instruction stream.
All bigger constants you have to reference through a pointer from memory.
Bigger offsets from a pointer cannot be included in the instruction either; you have to create them with extra instructions. The default GCC compiler setting is the big data model nowadays.
This means that pointers to immediates are by default generated with 2 extra instructions.
This means that for something which looks "simple" to a CISC developer, such as
ADD #64bitimmediate,Register
the code generated by default on POWER is
2 instructions to generate a 32-bit offset
1 instruction to load data from offset plus base pointer into a temp register
1 instruction to add the temp register to the register.
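Why two instructions for a 32-bit offset? Each instruction can only carry a 16-bit immediate, so the offset has to be split into a high and a low half (on POWER typically an addis followed by an addi or the displacement of the load). A rough Python sketch of the arithmetic, assuming the low half is sign-extended when it is added, so the high half has to be adjusted for that. The function names are made up for illustration, and offsets are assumed to lie between -0x80000000 and 0x7FFF7FFF.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed 16-bit number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def split_offset(offset):
    """Split a 32-bit offset into (high, low) signed 16-bit immediates."""
    low = sign_extend16(offset)
    # Subtracting the signed low half first is what adjusts the high half.
    high = sign_extend16((offset - low) >> 16)
    return high, low


def rebuild(high, low):
    """What the two add instructions compute: (high << 16) + low."""
    return (high << 16) + low


high, low = split_offset(0x12348000)
```

Here the low half 0x8000 counts as -32768 once sign-extended, so the high half becomes 0x1235 instead of 0x1234 to make up for it.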
If you look at generated code you see many more examples of this.
You see such code very often when you compare SSE instructions with POWER instructions:
x86 needs 1 instruction, directly referencing 1 operand from memory;
POWER needs 4 instructions to do exactly the same work.
You also see this with typical integer code.
Good CISC chips like the 68060 or modern x86 are, clock for clock, very efficient in integer operations.
It's very difficult for RISC chips to keep pace with them, as RISC chips need to execute many more instructions to do the same amount of work.